Scaling Down EM Using Graph Based Models
Abstract
The EM algorithm is used extensively in data mining problems that involve uncertainty in the form of missing values or latent variables. It was recently reported in KAIS as one of the top 10 data mining algorithms. Typical applications include indoor map building, mixture models, and image mining. However, as data mining applications are deployed beyond traditional desktop machines, using typical probabilistic parametric models becomes infeasible. Resource-constrained computational platforms such as hand-held GPS systems, small WiFi devices, and Sony Aibo robots have slow processors and limited or no floating-point routines, which makes probabilistic formulations of EM difficult to implement on them. In this paper we investigate the EM algorithm with undirected graphs as models. Constraints on the graph topology lead to different model classes, and we explore several that we believe have practical applications. We prove that the E-step reduces to the st-mincut problem, which can be solved in polynomial time. For two of our model classes, we derive a non-heuristic M-step that can be carried out in polynomial time. These algorithms do not require floating-point hardware, and we demonstrate their performance on several classic problems.
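To make the st-mincut reduction concrete, here is a minimal sketch of how missing binary labels on an undirected graph could be filled in with an integer-only minimum cut. The abstract does not give the paper's energy function or graph construction, so everything here is an assumption for illustration: the function name `min_cut_labels`, the choice of binary labels, the integer edge weights as disagreement penalties, and the clamping of observed nodes to the source or sink. The max-flow routine is plain Edmonds-Karp, chosen because it uses only integer arithmetic.

```python
from collections import deque


def min_cut_labels(n, edges, observed):
    """Label nodes 0..n-1 with 0/1 by an s-t minimum cut (illustrative only).

    edges:    list of (u, v, w) undirected edges with positive integer weights.
    observed: dict node -> 0 or 1 for nodes whose label is known.
    Returns (labels, cut_value). Hypothetical helper, not the paper's API.
    """
    s, t = n, n + 1
    cap = [dict() for _ in range(n + 2)]

    def add_arc(u, v, w):
        # Directed arc u -> v, plus a zero-capacity reverse entry for residuals.
        cap[u][v] = cap[u].get(v, 0) + w
        cap[v].setdefault(u, 0)

    # Larger than any cut made only of graph edges, so clamps are never cut.
    big = sum(w for _, _, w in edges) + 1
    for u, v, w in edges:
        add_arc(u, v, w)
        add_arc(v, u, w)
    for node, label in observed.items():
        if label == 1:
            add_arc(s, node, big)
        else:
            add_arc(node, t, big)

    def reachable_from_source():
        # Breadth-first search over arcs with remaining capacity.
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    cut_value = 0
    while True:
        parent = reachable_from_source()
        if t not in parent:
            break
        # Find the bottleneck along the path, then push that much flow.
        path = []
        v = t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
        cut_value += bottleneck

    # Nodes still reachable from the source take label 1, the rest label 0.
    labels = [1 if i in parent else 0 for i in range(n)]
    return labels, cut_value
```

For example, on a four-node chain with weights 5, 1, 5 where node 0 is observed as 1 and node 3 as 0, calling `min_cut_labels(4, [(0, 1, 5), (1, 2, 1), (2, 3, 5)], {0: 1, 3: 0})` cuts the weakest edge and splits the chain in the middle. All quantities stay integers throughout, which is the property the abstract highlights for platforms without floating-point support.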
Similar resources
A new 2D block ordering system for wavelet-based multi-resolution up-scaling
A complete and accurate analysis of the complex spatial structure of heterogeneous hydrocarbon reservoirs requires detailed geological models, i.e. fine resolution models. Due to the high computational cost of simulating such models, single resolution up-scaling techniques are commonly used to reduce the volume of the simulated models at the expense of losing the precision. Several multi-scale ...
Parallel Jobs Scheduling with a Specific Due Date: A semi-definite Relaxation-based Algorithm
This paper considers a different version of the parallel machines scheduling problem in which the parallel jobs simultaneously require a pre-specified job-dependent number of machines when being processed. This relaxation departs from one of the classic scheduling assumptions. While the analytical conditions can be easily stated for some simple models, a graph model approach is required when confli...
Scaling and Fractal Concepts in Saturated Hydraulic Conductivity: Comparison of Some Models
Measurement of soil saturated hydraulic conductivity, Ks, is normally affected by flow patterns such as macro pore; however, most current techniques do not differentiate flow types, causing major problems in describing water and chemical flows within the soil matrix. This study compares eight models for scaling Ks and predicted matrix and macro pore Ks, using a database composed of 50 datasets...
Developing Multiscale Simulation Models using the Software GroIMP
GroIMP (Growth grammar based Interactive Modelling Platform) is an open-source software tool focused on the development of functional-structural plant and forest stand models. It encompasses a domain-specific programming language, Extended L-System language (XL), which provides standard Java language features and additional rule-based graph rewriting mechanisms. A model with meaningful 3D repre...
Incomplete information in scale-free networks
We investigate the effect of incomplete information on the growth process of scale-free networks, a situation that occurs frequently, e.g., in real existing citation networks. Two models are proposed and solved analytically for the scaling behavior of the connectivity distribution. These models show a varying scaling exponent with respect to the model parameters but no break-down of scaling thus i...
Publication date: 2008